Do wordnets also improve human performance on NLP tasks?
Abstract
FinnWordNet is a wordnet for Finnish that complies with the format of the Princeton WordNet (PWN) (Fellbaum, 1998). It was built by having human translators translate the PWN 3.0 synsets into Finnish. It is open source and contains 117,000 synsets. The Finnish translations were inserted into the PWN structure, resulting in a bilingual lexical database. In natural language processing (NLP), wordnets have been used to infuse computers with semantic knowledge, on the assumption that humans already have a sufficient amount of this knowledge. In this paper we present a case study of using wordnets as an electronic dictionary. We tested whether native Finnish speakers benefit from using a wordnet while completing English sentence completion tasks. We found that using either an English wordnet or a bilingual English-Finnish wordnet significantly improves performance on the task. This should be taken into account when setting standards and when comparing human and computer performance on these tasks.
Similar resources
Beyond the Transfer-and-Merge Wordnet Construction: plWordNet and a Comparison with WordNet
Wordnets are lexico-semantic resources essential in many NLP tasks. Princeton WordNet is the most widely known, and the most influential, among them. Wordnets for languages other than English tend to adopt unquestioningly WordNet’s structure and its net of lexicalised concepts. We discuss a large wordnet constructed independently of WordNet, upon a model with a small yet significant difference....
Introduction to Tools for IndoWordNet and Word Sense Disambiguation
Lexically rich resources form the foundation to all NLP tasks. Maintaining the high quality of resources is thus a high priority issue. In this paper we exhibit the tools developed at IIT Bombay, for the purpose of creation, enhancement and maintenance of the WordNets, as well as the ones used for NLP tasks that use WordNets directly, like Word Sense Disambiguation. The paper presents online an...
Developing Parallel Sense-tagged Corpora with Wordnets
Semantically annotated corpora play an important role in natural language processing. This paper presents the results of a pilot study on building a sense-tagged parallel corpus, part of ongoing construction of aligned corpora for four languages (English, Chinese, Japanese, and Indonesian) in four domains (story, essay, news, and tourism) from the NTU-Multilingual Corpus. Each subcorpus is firs...
Cross-Lingual Validation of Multilingual Wordnets
Incorporating Wordnet or its monolingual followers in modern NLP-based systems already represents a general trend motivated by numerous reports showing significant improvements in the overall performances of these systems. Multilingual wordnets, such as EuroWordNet or BalkaNet, represent one step further with great promises in the domain of multilingual processing. The paper describes one possi...
Towards A Universal Index Of Meaning
The Inter-Lingual-Index (ILI) in the EuroWordNet architecture is an initially unstructured fund of concepts which functions as the link between the various language wordnets. The ILI concepts originate from WordNet and have been restructured on the basis of aspects of the internal structure of WordNet, links between WordNet and other resources, and multilingual mapping between the wordnets. This le...